# README

## Project Overview
This repository contains a reproducible workflow for harmonizing individual-level survey data (WVS and ISSP) with country-year macro indicators and estimating hierarchical linear models (HLM). The code is organized into five sequential stages, from environment setup and data preparation to model estimation and robustness checks.

## Repository Structure
```
├── code/
│   ├── paths.py           # Path configuration
│   ├── helpers.py         # Utility functions for cleaning and harmonization
│   ├── stage1_setup.py    # Environment setup and imports
│   ├── stage2_issp.py     # ISSP data cleaning and harmonization
│   ├── stage3_macro.py    # Macro-level data preparation (V-Dem, GII, GDP)
│   ├── stage4_merge.py    # Merge individual & macro datasets
│   └── stage5_model.py    # Model estimation and robustness checks
├── data/                  # Input raw data (not tracked)
├── output/                # Intermediate and final .parquet files
└── README.md              # This usage guide
```

## Prerequisites
- Python 3.8+
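- Install the dependencies, for example with the command below. If the scripts read and write `.parquet` files through pandas, a Parquet engine such as `pyarrow` or `fastparquet` is also needed: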
  ```bash
  pip install pandas numpy pycountry statsmodels matplotlib seaborn pyreadstat
  ```
- Raw survey files (WVS and ISSP) and macro CSVs, placed under `data/` as described in `code/paths.py`.
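
A quick way to confirm the prerequisites before running anything is to check the interpreter version and whether each package can be imported. The snippet below is a minimal sketch and not part of the repository: `REQUIRED`, `missing_packages`, and `python_ok` are hypothetical names, and the package list simply mirrors the `pip install` command above.

```python
import importlib.util
import sys

# Import names of the packages from the pip command above
# (for these packages the import name equals the pip name).
REQUIRED = [
    "pandas", "numpy", "pycountry", "statsmodels",
    "matplotlib", "seaborn", "pyreadstat",
]


def missing_packages(names=REQUIRED):
    """Return the names that cannot be found by the import system."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_ok(minimum=(3, 8)):
    """Return True if the running interpreter is at least `minimum`."""
    return sys.version_info >= minimum
```

Calling `missing_packages()` returns an empty list when everything is installed; any names it returns can be passed to `pip install`.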

## Configuration
1. Open `code/paths.py` and set the base directories for:
   - `RAW_WVS_PATH`, `RAW_ISSP_PATH`
   - `RAW_MACRO_PATH`
   - `OUTPUT_PATH`
   - `PLOT_PATH`

2. Verify that file names and folder structure match your local setup.
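
As an illustration of the settings listed above, the sketch below derives all five paths from a single project root with `pathlib`. It is an assumption about how `paths.py` could be organized, not a copy of it: the subfolder names `wvs`, `issp`, and `macro`, and the helpers `build_paths` and `missing_dirs`, are hypothetical.

```python
from pathlib import Path


def build_paths(project_root):
    """Return the five path settings, all derived from one project root."""
    root = Path(project_root)
    data = root / "data"
    return {
        "RAW_WVS_PATH": data / "wvs",      # assumed subfolder name
        "RAW_ISSP_PATH": data / "issp",    # assumed subfolder name
        "RAW_MACRO_PATH": data / "macro",  # assumed subfolder name
        "OUTPUT_PATH": root / "output",
        "PLOT_PATH": root / "plots",
    }


def missing_dirs(paths):
    """List the settings whose directory does not exist on disk."""
    return [name for name, path in paths.items() if not path.is_dir()]
```

Running `missing_dirs(build_paths("."))` from the project root is one way to carry out the verification in step 2: an empty list means every configured directory exists.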

## Usage: Run the Workflow
Execute the scripts in sequence from the project root:

1. **Stage 1: Setup**
   ```bash
   python code/stage1_setup.py
   ```
2. **Stage 2: ISSP Harmonization**
   ```bash
   python code/stage2_issp.py
   ```
3. **Stage 3: Macro Data**
   ```bash
   python code/stage3_macro.py
   ```
4. **Stage 4: Merge Datasets**
   ```bash
   python code/stage4_merge.py
   ```
5. **Stage 5: Modeling**
   ```bash
   python code/stage5_model.py
   ```

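Each script reads its inputs from `data/` (or, for later stages, from files that earlier stages wrote to `output/`) and writes its results to `output/`. Because the stages depend on one another, run them in the order shown.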
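
The five commands above can also be chained from one small driver script. The following is a sketch under assumptions, not a file shipped with the repository: `STAGES` and `run_pipeline` are hypothetical names, and the `runner` parameter exists only so the loop can be exercised without launching the real scripts.

```python
import subprocess
import sys

# Stage scripts in the order given above, relative to the project root.
STAGES = [
    "code/stage1_setup.py",
    "code/stage2_issp.py",
    "code/stage3_macro.py",
    "code/stage4_merge.py",
    "code/stage5_model.py",
]


def run_pipeline(stages=STAGES, runner=subprocess.run):
    """Run each stage in order and stop at the first failure."""
    for script in stages:
        print(f"Running {script}")
        # check=True makes subprocess.run raise CalledProcessError on a
        # non-zero exit code, so later stages do not run after a failure.
        runner([sys.executable, script], check=True)
```

Calling `run_pipeline()` from the project root runs all stages with the current interpreter; passing a shorter list, such as `run_pipeline(STAGES[3:])`, reruns only the merge and modeling stages.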

## Outputs
- `output/issp_combined.parquet` — Harmonized ISSP data
- `output/final_macro_combined.parquet` — Merged V-Dem, GII, GDP data
- `output/final_analysis_dataset.parquet` — Final analysis dataset
- `output/model_summary.txt` — Model results summary
- Plots are saved under `plots/` (the directory configured as `PLOT_PATH`)
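
After a full run, it can be useful to confirm that the files listed above were actually produced. The check below is a minimal sketch: `EXPECTED_OUTPUTS` and `missing_outputs` are hypothetical names, and the file list is copied from this section.

```python
from pathlib import Path

# Output files named in this section, relative to the project root.
EXPECTED_OUTPUTS = [
    "output/issp_combined.parquet",
    "output/final_macro_combined.parquet",
    "output/final_analysis_dataset.parquet",
    "output/model_summary.txt",
]


def missing_outputs(project_root="."):
    """Return the expected output files that are not present on disk."""
    root = Path(project_root)
    return [rel for rel in EXPECTED_OUTPUTS if not (root / rel).is_file()]
```

An empty list from `missing_outputs()` means every expected file exists. A non-empty list indicates which outputs are missing and therefore which stage to rerun.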

## Extending and Customization
- Modify variable lists, model formulas, or add new robustness checks in `stage5_model.py`.
- Update helper functions in `helpers.py` for additional cleaning rules.

## License & Citation
Include your license and citation information here.

